CLARIFY: Human-Powered Training of SMT Models
Abstract
We present CLARIFY, an augmented environment that aims to improve the quality of translations generated by phrase-based statistical machine translation systems by learning from humans. CLARIFY accepts four types of knowledge input: 1) direct input, 2) results from a word alignment game, 3) results from a phrase alignment game, and 4) results from a paraphrasing game. Each of these inputs elicits user knowledge to improve future system translations.
Similar resources
Feasibility Study of Building a Human Powered Hydrofoil Vessel
In this paper, a feasibility study of building a Human Powered Hydrofoil (HPH) vessel is reported. Hydrofoil vessels are a well-known class of high-speed craft. In addition to high-speed operation, hydrofoils offer reliable maneuverability, good stability, and proper operation in waves. Human-powered vehicles are also an advancing idea. Different aspects of the design and...
Applying Morphology Generation Models to Machine Translation
We improve the quality of statistical machine translation (SMT) by applying models that predict word forms from their stems using extensive morphological and syntactic information from both the source and target languages. Our inflection generation models are trained independently of the SMT system. We investigate different ways of combining the inflection prediction component with the SMT syst...
Feature Decay Algorithms for Fast Deployment of Accurate Statistical Machine Translation Systems
We use feature decay algorithms (FDA) for fast deployment of accurate statistical machine translation systems taking only about half a day for each translation direction. We develop parallel FDA for solving computational scalability problems caused by the abundance of training data for SMT models and LM models and still achieve SMT performance that is on par with using all of the training data ...
PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices
The statistical machine translation (SMT) approach has taken a leading place in the field of machine translation because of its better translation quality and lower training cost compared to other approaches. However, because of its high demand for computing resources, an SMT system cannot be run directly on hand-held devices. Most existing hand-held translation systems are either interlingua-based, which...
Stacking for Statistical Machine Translation
We propose applying stacking, an ensemble learning technique, to statistical machine translation (SMT) models. A diverse ensemble of weak learners is created using the same SMT engine (a hierarchical phrase-based system) by manipulating the training data, and a strong model is created by combining the weak models on-the-fly. Experimental results on two language pairs and three different si...